A Two-Level Morphological Analyser for the Indonesian Language

Authors

  • Femphy Pisceldo
  • Rahmad Mahendra
  • Ruli Manurung
  • I Wayan Arka
Abstract

This paper presents our efforts at developing an Indonesian morphological analyser that provides a detailed analysis of the rich affixation process. We model Indonesian morphology using a two-level morphology approach, decomposing the process into a set of morphotactic and morphophonemic rules. These rules are modelled as a network of finite state transducers and implemented using xfst and lexc. Our approach is able to handle reduplication, a non-concatenative morphological process.
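As a rough illustration of the split between morphotactics (which morphemes combine) and morphophonemics (how the combination is spelled), the sketch below generates and analyses forms with the active prefix meN- and with full reduplication. It is plain Python, not xfst or lexc, and it is not the authors' rule set: the allomorph table is a simplified textbook summary that ignores monosyllabic roots and other exceptions, and the function names and analysis tags (`meN+`, `+Redup`) are hypothetical.

```python
import re

# Simplified allomorph table for the prefix meN- (an illustrative assumption,
# not the rules from the paper). Each entry is:
# (pattern on the start of the root, surface prefix, drop first letter of root?)
MEN_RULES = [
    (r"ng|ny|[lrmnwy]", "me", False),   # lihat -> melihat
    (r"p", "mem", True),                # pukul -> memukul
    (r"[bfv]", "mem", False),           # baca  -> membaca
    (r"t", "men", True),                # tulis -> menulis
    (r"[dcjz]|sy", "men", False),       # dengar -> mendengar
    (r"k(?!h)", "meng", True),          # kirim -> mengirim
    (r"s(?!y)", "meny", True),          # sapu  -> menyapu
]


def apply_men(root: str) -> str:
    """Generate the surface form of meN- + root (morphophonemic step)."""
    for pattern, prefix, drop_initial in MEN_RULES:
        if re.match(pattern, root):
            stem = root[1:] if drop_initial else root
            return prefix + stem
    # Vowels, g, h and kh fall through to the default allomorph meng-.
    return "meng" + root


def reduplicate(root: str) -> str:
    """Full reduplication, written with a hyphen: buku -> buku-buku."""
    return f"{root}-{root}"


def analyse(surface: str, lexicon: list[str]) -> list[str]:
    """Analyse by generating from every root in a toy lexicon and comparing.

    A finite-state transducer runs in both directions natively; this
    brute-force inversion only mimics that behaviour on a small word list.
    """
    analyses = []
    for root in lexicon:
        if apply_men(root) == surface:
            analyses.append(f"meN+{root}")      # hypothetical tag format
        if reduplicate(root) == surface:
            analyses.append(f"{root}+Redup")    # hypothetical tag format
    return analyses
```

With a toy lexicon such as `["pukul", "baca", "buku"]`, calling `analyse("memukul", ...)` recovers the root pukul even though its initial p is absent from the surface form, which is the kind of mapping the morphophonemic rules have to encode.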

Similar resources

Word classes in Indonesian: A linguistic reality or a convenient fallacy in natural language processing?

This paper looks at Indonesian (Bahasa Indonesia), and the claim that there is no noun-verb distinction within the language as it is spoken in regions such as Riau and Jakarta. We test this claim for the language as it is written by a variety of Indonesian speakers using empirical methods traditionally used in part-of-speech induction. In this study we use only morphological patterns that we ge...

Indonesian Morphology Tool (MorphInd): Towards an Indonesian Corpus

This paper describes a robust finite state morphology tool for Indonesian (MorphInd), which handles both morphological analysis and lemmatization for a given surface word form so that it is suitable for further language processing. MorphInd has wider coverage on handling Indonesian derivational and inflectional morphology compared to an existing Indonesian morphological analyzer [1], along with...

Creating derivational morphology links in Wordnet Bahasa (Nanyang Technological University, School of Humanities and Social Sciences)

Derivational morphology links are created for the Wordnet Bahasa, a combined Indonesian and Malay online lexical dictionary (Nurril Hirfana, Suerya, & Bond, 2011). The focus was to link root words to affixed words as affixation is one of the more apparent word formation processes in Bahasa Melayu. MorphInd, an Indonesian morphological analyser (Larasati, Kubon, & Zeman, 2011), is used to breakd...

Towards an Indonesian-English SMT System: A Case Study of an Under-Studied and Under-Resourced Language, Indonesian

This paper describes work on preparing an Indonesian-English Statistical Machine Translation (SMT) system. It includes the creation of an Indonesian morphological analyzer, MorphInd, and the compilation of an Indonesian-English parallel corpus, IDENTIC. We build an SMT system using the state-of-the-art phrase-based SMT system, MOSES. We show several scenarios where the morphological tool is used t...

Syntactic Underspecification in Riau Indonesian

Indonesian is known for having a relatively simple morphological and syntactic structure. This is especially true of local varieties of the language, where contrast between categories found in Standard Indonesian is neutralized. In the Indonesian variety spoken in Riau Province, there is almost no morphological marking of grammatical categories and there is relatively free word order. Gil (1994...

Publication date: 2008